Adaptive Monte Carlo via Bandit Allocation
Abstract
We consider the problem of sequentially choosing among a set of unbiased Monte Carlo estimators to minimize the mean squared error (MSE) of a final combined estimate. By reducing this task to a stochastic multi-armed bandit problem, we show that well-developed allocation strategies can be used to achieve an MSE that approaches that of the best estimator chosen in retrospect. We then extend these developments to a scenario where alternative estimators have different, possibly stochastic, costs. The outcome is a new set of adaptive Monte Carlo strategies that provide stronger guarantees than previous approaches while offering practical advantages.
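To make the reduction concrete, here is a minimal sketch, assuming one plausible instantiation rather than the paper's exact procedure: each unbiased estimator is an arm whose loss is its sampling variance, a lower-confidence-bound rule spends most of the budget on the arm whose variance appears smallest, and the final estimate is the sample mean of the most-used arm. The function name `bandit_mc`, the exploration bonus, and the toy estimators are all illustrative assumptions.

```python
import numpy as np

def bandit_mc(estimators, n_rounds):
    """Adaptively allocate a sampling budget among unbiased estimators.

    Each element of `estimators` is a callable returning one i.i.d. draw
    of an unbiased estimate of the same unknown quantity.  The arm with
    the smallest plausible variance (empirical variance minus an
    exploration bonus) receives the next sample.
    """
    K = len(estimators)
    samples = [[] for _ in range(K)]

    # Seed every arm with two draws so empirical variances are defined.
    for k in range(K):
        samples[k].extend(estimators[k]() for _ in range(2))

    for t in range(2 * K, n_rounds):
        scores = []
        for k in range(K):
            n_k = len(samples[k])
            var_k = np.var(samples[k], ddof=1)
            bonus = np.sqrt(2.0 * np.log(t + 1) / n_k)   # optimism term
            scores.append(max(var_k - bonus, 0.0))       # lower conf. bound
        k_star = int(np.argmin(scores))
        samples[k_star].append(estimators[k_star]())

    # Report the mean of the arm that received the most samples; its MSE
    # tracks that of the lowest-variance estimator as the budget grows.
    best = max(range(K), key=lambda k: len(samples[k]))
    return float(np.mean(samples[best]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two unbiased estimators of E[U^2] = 1/3 for U ~ Uniform(0, 1):
    # a crude draw and an antithetic (lower-variance) variant.
    crude = lambda: rng.uniform() ** 2

    def antithetic():
        u = rng.uniform()
        return 0.5 * (u ** 2 + (1.0 - u) ** 2)

    print(bandit_mc([crude, antithetic], n_rounds=2000))
```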
Similar resources
Adaptive strategy for stratified Monte Carlo sampling
We consider the problem of stratified sampling for Monte Carlo integration of a random variable. We model this problem as a K-armed bandit, where the arms represent the K strata. The goal is to estimate the integral mean, that is, a weighted average of the mean values of the arms. The learner is allowed to sample the variable n times, but it can decide on-line which stratum to sample next. We pr...
Finite Time Analysis of Stratified Sampling for Monte Carlo
We consider the problem of stratified sampling for Monte-Carlo integration. We model this problem in a multi-armed bandit setting, where the arms represent the strata, and the goal is to estimate a weighted average of the mean values of the arms. We propose a strategy that samples the arms according to an upper bound on their standard deviations and compare its estimation quality to an ideal al...
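Both stratified-sampling entries above allocate samples using optimistic estimates of the per-stratum standard deviations, which pushes the sample counts toward the Neyman-optimal proportions n_k proportional to w_k * sigma_k. The sketch below is an assumed illustration of that allocation rule, not a reproduction of either paper's algorithm; the helper name `adaptive_stratified_mc` and the exploration constant are made up for the example.

```python
import numpy as np

def adaptive_stratified_mc(strata, weights, n_rounds):
    """Estimate sum_k w_k * E[X_k] by deciding online which stratum to sample.

    `strata[k]` is a callable returning one draw from stratum k and
    `weights[k]` is that stratum's (known) weight.  The next sample goes
    to the stratum with the largest optimistic value of w_k * sigma_k / n_k,
    which drives the allocation toward n_k proportional to w_k * sigma_k.
    """
    K = len(strata)
    samples = [[] for _ in range(K)]

    # Two draws per stratum so standard deviations are defined.
    for k in range(K):
        samples[k].extend(strata[k]() for _ in range(2))

    for t in range(2 * K, n_rounds):
        scores = []
        for k in range(K):
            n_k = len(samples[k])
            sigma_k = np.std(samples[k], ddof=1)
            bonus = np.sqrt(2.0 * np.log(t + 1) / n_k)   # optimism on sigma_k
            scores.append(weights[k] * (sigma_k + bonus) / n_k)
        k_star = int(np.argmax(scores))
        samples[k_star].append(strata[k_star]())

    return float(sum(w * np.mean(s) for w, s in zip(weights, samples)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Integrate f(u) = u^2 over [0, 1] with two equal-width strata.
    strata = [lambda: rng.uniform(0.0, 0.5) ** 2,
              lambda: rng.uniform(0.5, 1.0) ** 2]
    print(adaptive_stratified_mc(strata, weights=[0.5, 0.5], n_rounds=2000))
```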
Improving Monte Carlo Tree Search Policies in StarCraft via Probabilistic Models Learned from Replay Data
Applying game-tree search techniques to RTS games poses a significant challenge, given the large branching factors involved. This paper studies an approach to incorporate knowledge learned offline from game replays to guide the search process. Specifically, we propose to learn Naive Bayesian models predicting the probability of action execution in different game states, and use them to inform t...
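As a rough illustration of how such a learned action-probability model could inform the tree policy, the fragment below uses a PUCT-style selection index in which the prior scales the exploration term; this is an assumed stand-in for the paper's approach, and `select_action`, `prior`, and the constants are hypothetical names.

```python
import math

def select_action(node_stats, prior, c_puct=1.5):
    """Pick the next action to explore at a search node.

    `node_stats` maps each legal action to (visit_count, total_value) and
    `prior(action)` returns the learned probability that a strong player
    executes `action` in this state (e.g. a model fit on game replays).
    The prior steers exploration toward actions the model considers likely.
    """
    total_visits = sum(n for n, _ in node_stats.values())
    best_action, best_score = None, float("-inf")
    for action, (n, w) in node_stats.items():
        q = w / n if n > 0 else 0.0                      # mean value so far
        u = c_puct * prior(action) * math.sqrt(total_visits + 1) / (1 + n)
        if q + u > best_score:
            best_action, best_score = action, q + u
    return best_action


if __name__ == "__main__":
    stats = {"attack": (10, 6.0), "expand": (3, 2.0), "scout": (0, 0.0)}
    learned = {"attack": 0.5, "expand": 0.3, "scout": 0.2}
    print(select_action(stats, prior=lambda a: learned.get(a, 0.0)))
```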
Sequential Monte Carlo Bandits
In this paper we propose a flexible and efficient framework for handling multi-armed bandits, combining sequential Monte Carlo algorithms with hierarchical Bayesian modeling techniques. The framework naturally encompasses restless bandits, contextual bandits, and other bandit variants under a single inferential model. Despite the model’s generality, we propose efficient Monte Carlo algorithms t...
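One concrete way to pair sequential Monte Carlo with bandit decisions is particle-based Thompson sampling, sketched below for Bernoulli rewards; this is an assumption about the general idea rather than the paper's hierarchical model, and `smc_thompson` and its resampling threshold are illustrative choices.

```python
import numpy as np

def smc_thompson(arm_probs, n_rounds, n_particles=500, seed=0):
    """Particle-based Thompson sampling for Bernoulli bandits.

    Each arm's unknown success probability is represented by a weighted
    particle set; observing a reward reweights the particles by their
    likelihood, and degenerate sets are refreshed by resampling.  At each
    round one particle is drawn per arm and the arm with the largest
    sampled value is played.
    """
    rng = np.random.default_rng(seed)
    K = len(arm_probs)
    particles = rng.uniform(size=(K, n_particles))        # prior: Uniform(0, 1)
    weights = np.full((K, n_particles), 1.0 / n_particles)
    total_reward = 0.0

    for _ in range(n_rounds):
        # Thompson step: sample one plausible parameter per arm.
        sampled = [rng.choice(particles[k], p=weights[k]) for k in range(K)]
        k = int(np.argmax(sampled))

        reward = float(rng.random() < arm_probs[k])        # pull the arm
        total_reward += reward

        # SMC update: reweight arm k's particles by the Bernoulli likelihood.
        lik = particles[k] if reward else (1.0 - particles[k])
        weights[k] *= lik
        weights[k] /= weights[k].sum()

        # Resample (with a small jitter) when the effective sample size drops.
        ess = 1.0 / np.sum(weights[k] ** 2)
        if ess < n_particles / 2:
            idx = rng.choice(n_particles, size=n_particles, p=weights[k])
            jitter = rng.normal(0.0, 0.02, n_particles)
            particles[k] = np.clip(particles[k][idx] + jitter, 1e-3, 1.0 - 1e-3)
            weights[k] = np.full(n_particles, 1.0 / n_particles)

    return total_reward


if __name__ == "__main__":
    print(smc_thompson(arm_probs=[0.3, 0.5, 0.7], n_rounds=2000))
```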
Model-Free Adaptive Rate Selection in Cognitive Radio Links
In this work we address the rate adaptation problem of a cognitive radio (CR) link in time-variant fading channels. Every time the primary users (PUs) vacate the channel, the secondary user (SU) selects a transmission rate (from a finite number of available rates) and begins transmitting fixed-size packets until a licensed user reclaims the channel. After each transmission episode...
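The core of such a model-free scheme can be phrased as a bandit over the available rates, with each transmission episode returning the achieved throughput as a reward. The sketch below uses a plain UCB1 index and a simulated channel; it is an assumed illustration rather than the paper's algorithm, and `ucb_rate_selection` and `success_prob` are hypothetical names.

```python
import math
import random

def ucb_rate_selection(rates, success_prob, n_episodes, seed=0):
    """Treat each available transmission rate as a bandit arm.

    The reward of an episode is the achieved throughput, i.e. the chosen
    rate times the fraction of packets delivered; `success_prob(rate)`
    simulates the (unknown) per-packet delivery probability at that rate.
    A UCB1-style index balances trying under-used rates against reusing
    the rate with the best observed throughput.
    """
    rng = random.Random(seed)
    counts = [0] * len(rates)
    mean_throughput = [0.0] * len(rates)
    max_rate = max(rates)                       # normalise rewards to [0, 1]

    for t in range(1, n_episodes + 1):
        # Play each rate once, then follow the UCB1 index.
        if t <= len(rates):
            k = t - 1
        else:
            k = max(range(len(rates)),
                    key=lambda i: mean_throughput[i] / max_rate
                    + math.sqrt(2.0 * math.log(t) / counts[i]))

        delivered = sum(rng.random() < success_prob(rates[k]) for _ in range(20))
        reward = rates[k] * delivered / 20.0    # throughput this episode
        counts[k] += 1
        mean_throughput[k] += (reward - mean_throughput[k]) / counts[k]

    best = max(range(len(rates)), key=lambda i: mean_throughput[i])
    return rates[best], mean_throughput[best]


if __name__ == "__main__":
    # Higher rates fail more often on this simulated fading channel.
    rates = [6.0, 12.0, 24.0, 48.0]
    print(ucb_rate_selection(rates, lambda r: max(0.05, 1.0 - r / 60.0), 3000))
```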